The role of epigenetics in rare diseases

Abstract
Epigenetic control systems are based on chromatin modifications (DNA methylation, histone modifications and nucleosome positioning), which affect the local kinetics of gene expression. They play an important role in maintaining cell fate decisions, X inactivation and genomic imprinting. Aberrant chromatin states that are associated with a deleterious change in gene expression are called epimutations. An epimutation can be a primary epimutation that has occurred in the absence of any genetic change or a secondary epimutation that results from a mutation of a cis-acting regulatory element or trans-acting factor. Epimutations may play a causative role in disease, for example in imprinting disorders, or may be part of the pathogenetic mechanism as in the fragile X syndrome and in syndromes caused by a mutation affecting a chromatin modifier. For several diseases, DNA methylation testing is an important tool in the diagnostic work-up of patients.


Introduction
Many genetic syndromes arise from errors in cell differentiation and embryonic development. Cell fate decisions and cell lineage identity are determined by transcription factors, which activate or repress specific genes, and are stabilized by enzymes that modify chromatin and the local kinetics of gene expression. The major chromatin modifications are the methylation of cytosine residues within CpG dinucleotides of the DNA, histone modifications (mainly acetylation, methylation, phosphorylation and ubiquitination of specific amino acid side chains) and nucleosome positioning. These modifications and the proteins that make, read and erase them are known as the "epigenetic control system". In several cases, such as X inactivation, long non-coding RNAs are also involved. While the term "epigenetics" was coined by Conrad Hal Waddington in 1942 to describe the role of genes in development [1], the term "epigenetic control system" was introduced in 1958 by David Nanney [2], although at that time the molecular mechanisms were unknown. Nanney chose the term "epigenetic" to "emphasize the reliance of these systems on the genetic systems and to underscore their significance in developmental processes". Most importantly, he pointed out that "certain patterns of expression, although specifically induced, may be perpetuated in the absence of the inducing conditions". According to Nanney, such a "cellular memory" ensures that differentiated cells maintain their phenotype even through multiple rounds of cell division.
Among the many chromatin modifications known today, only the methylation of DNA, the methylation of histone H3 at lysine 9 (H3K9me3) and the methylation of histone H3 at lysine 27 (H3K27me3), all of which are hallmarks of repressive chromatin, can be replicated along with the DNA, because there are chromatin modifiers that can read and copy these modifications (Fig. 1). It is a matter of debate whether nucleosomes remember where they were before DNA replication [3]. Thus, strictly speaking, only DNA, H3K9 and H3K27 methylation (and possibly the position of nucleosomes) contribute to the cellular memory, although in common parlance all chromatin modifications are subsumed under the term "epigenetics". Repressive chromatin not only maintains the silent states of genes that have been switched off during development, but also serves to maintain the silencing of one allele in X inactivation and genomic imprinting.

X inactivation and X-linked diseases
In female mammals, one X chromosome is inactivated to ensure that expression levels of X-linked genes are equal between XY males and XX females, although 15-30 % of X-linked genes appear to escape inactivation. In females with a normal karyotype, X inactivation is random with regard to the maternal and paternal X chromosome (the situation is different in the case of an X-autosome translocation). It is initiated at the X inactivation centre (Xic) [4], which harbours the Xist gene (X-inactive specific transcript). Xist RNA is transcribed from the X chromosome that has been selected to become inactivated and coats this chromosome. X inactivation involves an extensive change of chromatin modifications: active histone marks are removed and repressive histone marks are deposited. Later, DNA methylation plays an important role in maintaining the inactive state.
Since females are mosaics of cells in which the paternal X chromosome is inactivated and cells in which the maternal X chromosome is inactivated, they are less susceptible to X-linked disorders. First, a pathogenic variant is not expressed in all cells. Second, cells expressing a pathogenic variant can receive the protein from cells that express the wild-type allele. Third, cells expressing a pathogenic variant may be outgrown by cells expressing the normal allele. Thus, the clinical manifestations of an X-linked pathogenic variant are milder or non-existent. However, there are some X-linked diseases which are lethal in males and manifest in females.

Genomic imprinting and imprinting disorders
The human genome harbors more than 100 genes that are expressed from either the paternal or the maternal allele only. These genes are subject to genomic imprinting, which is an epigenetic process by which the male and the female germ line confer a specific mark (mainly DNA methylation) onto these loci. Most of the imprinted genes occur in clusters, which are under the control of an imprinting control centre. In humans, clinically relevant imprinted gene clusters have been found on chromosomes 6, 7, 11, 14, 15, 16 and 20.

Epimutations
Aberrant chromatin states that are associated with a deleterious change in the kinetics of gene expression are called "epimutations". An epimutation can be a primary epimutation that has occurred in the absence of any genetic change or a secondary epimutation that results from a mutation in a cis-acting regulatory element or trans-acting factor [6]. Epimutations may play a causative role in disease, for example in imprinting disorders, may be part of the pathogenetic mechanism as in the fragile X syndrome and in chromatin modifier syndromes, or may just reflect a disease state (Fig. 2).

Epimutations in imprinting disorders
Epimutations in imprinting disorders result from an error in imprint erasure in a primordial germ cell, an error in imprint establishment in a maturing germ cell or an error in imprint maintenance in an early embryonic cell. Primary epimutations affecting an imprinted region are rare stochastic events and a direct cause of an imprinting disorder. Since they do not result from a genetic defect, they are not associated with an increased recurrence risk. Secondary epimutations result from imprinting centre mutations affecting in cis the establishment or maintenance of an imprint. In familial cases, the recurrence risk is 50 %, but the penetrance depends on the sex of the mutation carrier. A mutation of the Angelman syndrome (AS) imprinting control centre on chromosome 15, for example, which is necessary for the establishment of the maternal imprint in 15q11q13, is silently transmitted through the paternal germline, but causes AS when transmitted through the maternal germline [7].
Many cases of multi-locus imprinting disturbance (MLID) are caused by a mutation in ZFP57, which encodes a transcriptional repressor. ZFP57 binds to the methylated hexanucleotide motif TGCmCGC in imprinting control regions to maintain DNA methylation after fertilization. MLID may also be caused by genetic variants of maternal effect genes such as NLRP5, which encode proteins supplied by the oocyte and required for early embryonic development, but this needs to be confirmed. Multi-locus imprinting errors are an example of secondary epimutations caused by a genetic defect in a trans-acting factor.
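The parent-of-origin dependence described above can be illustrated with a small calculation. The following Python sketch is a deliberate simplification and not a counselling tool: the function name `as_recurrence_risk` and the penetrance values (complete penetrance after maternal transmission, none after paternal transmission) are assumptions made for illustration, based only on the statement that the mutation is silent when paternally transmitted and causes AS when maternally transmitted.

```python
from fractions import Fraction

# Assumed for illustration: a heterozygous carrier transmits the
# imprinting centre mutation to each child with probability 1/2.
TRANSMISSION_PROBABILITY = Fraction(1, 2)

# Assumed penetrance by sex of the transmitting parent (simplified):
# silent through the paternal germline, AS through the maternal germline.
PENETRANCE_BY_CARRIER = {
    "mother": Fraction(1),
    "father": Fraction(0),
}


def as_recurrence_risk(carrier_parent: str) -> Fraction:
    """Risk that a child of a heterozygous carrier is affected by AS."""
    if carrier_parent not in PENETRANCE_BY_CARRIER:
        raise ValueError("carrier_parent must be 'mother' or 'father'")
    return TRANSMISSION_PROBABILITY * PENETRANCE_BY_CARRIER[carrier_parent]


if __name__ == "__main__":
    for parent in ("mother", "father"):
        print(parent, as_recurrence_risk(parent))
```

Note that a carrier father still passes the mutation on to half of his children; under these assumptions the risk reappears in the offspring of his carrier daughters.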
DNA methylation testing is a major component of the diagnostic work-up of patients suspected of having an imprinting disorder.
Apart from imprinting disorders, primary epimutations are very rare. Most of the epimutations in non-imprinting disorders are secondary epimutations caused by a genetic variant in cis or, more often, by a genetic variant in trans.

Epimutations caused by a genetic variant in cis
Several genetic variants can cause an epimutation in cis. In this case, the epimutation is not the cause of the disease, but part of the pathogenetic mechanism. Here I discuss in more detail two diseases in which an expansion or a contraction of a tandem repeat sequence affects the epigenetic state of a gene: the fragile X mental retardation syndrome 1 (FMR1) and the facioscapulohumeral muscular dystrophy (FSHD), respectively.
FMR1 is an X-linked disease caused by the expansion of an unstable trinucleotide repeat (CGG) within the promoter and exon 1 region of the FMR1 gene [8]. The number of CGGs varies in the human population. Trinucleotide repeats with 55 and more copies are unstable and expand to several hundred copies during the proliferation of the diploid oogonia in the foetal ovary. Repeat expansions of up to 200 copies are called premutations. Premutations do not cause FMR1, but increase the risk for Fragile X Tremor Ataxia Syndrome (FXTAS) and Fragile X Premature Ovarian Insufficiency (FXPOI). After fertilization of an oocyte carrying more than 200 copies, the CGG repeat and the FMR1 gene promoter are methylated. DNA methylation and the establishment of repressive chromatin in this region silence the FMR1 gene.
FSHD is an autosomal dominant disorder that has been linked to a 3.3 kb tandemly repeated sequence (D4Z4) in the subtelomeric region of the long arm of chromosome 4, with each repeat unit containing a copy of the double homeobox protein 4 gene (DUX4). In normal individuals the number of D4Z4 repeats varies between 11 and 150 units, and expression of DUX4 is inhibited by DNA methylation. Patients with FSHD have fewer than 11 repeats and these are not methylated. Contraction of the repeats is accompanied by loss of DNA methylation and unsilencing of DUX4 [9]. DUX4 is a pioneer transcription factor that acts on several target genes to cause FSHD. One form of FSHD (FSHD4) can also result from a mutation in DNMT3B (see Table 1).
There are also several examples in which a genetic mutation in one gene affects the epigenetic state of an adjacent gene. This has been observed in some patients with Lynch syndrome [10], α-thalassemia [11] and combined methylmalonic acidemia and homocystinuria, cblC type [12]. In these cases, transcriptional readthrough from a mutant neighboring gene has caused silencing and methylation of the disease gene.
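The repeat-size thresholds given above can be summarized as a simple lookup. The Python sketch below uses only the two cut-offs named in the text (55 and 200 copies); the function name `classify_cgg_repeats` and the category labels are hypothetical, and the sketch is meant as an illustration of the thresholds, not as a diagnostic rule.

```python
def classify_cgg_repeats(n_repeats: int) -> str:
    """Classify an FMR1 CGG repeat count using the thresholds in the text.

    Labels are illustrative assumptions:
      fewer than 55 copies  -> "stable"
      55 to 200 copies      -> "premutation" (unstable; FXTAS/FXPOI risk)
      more than 200 copies  -> "full mutation" (repeat and promoter methylated)
    """
    if n_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if n_repeats > 200:
        return "full mutation"
    if n_repeats >= 55:
        return "premutation"
    return "stable"


if __name__ == "__main__":
    for count in (30, 55, 200, 201, 650):
        print(count, classify_cgg_repeats(count))
```

Only the full-mutation class is expected to show methylation and silencing of the FMR1 promoter, which is why DNA methylation testing complements repeat sizing.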

Epimutations caused by a genetic variant in trans
The epigenetic system comprises more than 100 proteins involved in making, reading and erasing chromatin modifications. Among them are four members of the DNA methyltransferase (DNMT) family, three Fe(II)/α-ketoglutarate-dependent dioxygenases (TETs), which erase DNA methylation, six methyl-CpG-binding proteins, >20 histone acetyltransferases (HATs) and other factors with HAT activity, 18 histone deacetylases (HDACs), 20 lysine methyltransferases (KMTs), eight lysine demethylases (KDMs), nine protein arginine methyltransferases (PRMTs), several other histone modifying enzymes (kinases, dephosphorylases, ubiquitinases, deubiquitinases etc.) and >20 nucleosome remodelling factors. A mutation affecting any of these factors leads to aberrant chromatin states and altered kinetics of gene expression at very many loci, although only some of these loci may be disease-relevant. As of 2023, more than 70 genetic syndromes caused by a mutation in a chromatin modifier have been recognized ([13] and Table 1). Depending on the gene and the function of the gene product, a mutation results in the loss or gain of function. The diseases follow autosomal dominant, autosomal recessive or X-linked inheritance, although most cases are sporadic.
Weaver syndrome is a good example of this class of diseases. It is caused by a mutation in the EZH2 gene, which encodes an H3K27 histone methyltransferase that is part of the Polycomb repressive complex 2 (PRC2; see Fig. 1 and Table 1). Mutations in the other two PRC2 components (EED and SUZ12) cause the "Weaver-like" Cohen-Gibson syndrome and Imagawa-Matsumoto syndrome, respectively (Table 1).
Chromatin modifiers cannot read DNA sequence context, but are recruited to specific genomic regions by transcription factors, which recognize and bind to their cognate sequence element. Pioneer factors can even bind to and open repressive chromatin. In fact, the genome-wide patterns of DNA methylation, histone modifications and nucleosome position are primarily shaped by transcription factors [14,15]. Mutations affecting transcription factors also affect the epigenome, because absent or dysfunctional transcription factors not only fail to activate or repress their target genes, but also fail to recruit chromatin modifiers to their target loci. In these cases, aberrant chromatin states reflect rather than contribute to disease. Therefore, I do not cover transcription factor diseases in this review.

Episignatures
Microarray-based genome-wide DNA methylation scans of peripheral blood DNA have revealed unique DNA methylation patterns called "episignatures" in more than 60 genetic syndromes caused by mutations in genes encoding chromatin modifiers and transcription factors, although many of the affected proteins are not involved in methylating or demethylating DNA [16]. For an episignature it does not matter if the changes in DNA methylation cause disease, are part of the pathogenetic mechanism or just reflect the disease state. At individual CpG sites, the changes are small, which indicates that only a few cells are affected or that there is a change in the cell mixture distribution. Based on the methylation levels of ~150 CpGs that have been found to be the most representative CpGs for a given disease, disease-specific diagnostic classifiers have been developed, which can aid the diagnosis of a rare disease and the classification of variants of unknown significance (VUS).
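To make the idea of a methylation-based classifier concrete, the following Python sketch assigns a sample to the reference group whose mean methylation profile it most closely resembles. This is a toy nearest-centroid approach chosen for brevity; the text does not specify how the published classifiers are built, and all names (`centroid`, `classify`) and all methylation values are invented for illustration. Four CpGs stand in for the ~150 used in practice.

```python
from statistics import mean


def centroid(profiles):
    """Mean methylation level (0..1) per CpG across a group of samples."""
    return [mean(column) for column in zip(*profiles)]


def distance(a, b):
    """Euclidean distance between two methylation profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def classify(sample, centroids):
    """Return the label of the closest reference centroid."""
    return min(centroids, key=lambda label: distance(sample, centroids[label]))


# Invented toy reference data: rows are samples, columns are CpG sites.
controls = [[0.80, 0.20, 0.50, 0.90], [0.82, 0.18, 0.52, 0.88]]
syndrome = [[0.70, 0.30, 0.50, 0.80], [0.72, 0.28, 0.48, 0.82]]

centroids = {"control": centroid(controls), "syndrome": centroid(syndrome)}

if __name__ == "__main__":
    # Hypothetical patient carrying a variant of unknown significance.
    vus_sample = [0.71, 0.29, 0.49, 0.81]
    print(classify(vus_sample, centroids))
```

Because the per-CpG differences are small, as noted above, a real classifier depends on combining many sites and on careful control of blood cell composition; the sketch ignores both.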

Conclusions and Outlook
Aberrant chromatin states (epimutations) play an important role in a number of rare diseases. For several diseases, DNA methylation analysis can help to confirm or exclude a clinical diagnosis. In diseases caused by a mutation in a chromatin modifier it remains to be elucidated which are the disease-relevant loci that are affected by a secondary epimutation and how the dysregulation of these loci leads to disease. It also remains to be shown whether newly developed tools such as epigenome editing with modified CRISPR-Cas enzymes, which, for example, allow methyl groups to be added to or removed from CpG dinucleotides in specific gene regulatory regions, can be developed into therapeutic approaches [17]. In view of the fact that only a few chromatin modifications are mitotically stable (see Introduction) and that cell fate decisions made during early development cannot be undone in childhood or adulthood, I am rather skeptical about the clinical application of this approach in rare diseases. Probably, other avenues have to be explored.

Figure 1: Replication of DNA methylation and H3K27me3. A) UHRF1 (Ubiquitin Like With PHD And Ring Finger Domains 1) and DNMT1 (DNA methyltransferase 1) are recruited to the replication fork and recognize hemimethylated CpGs. DNMT1 methylates the CpG dinucleotides in the newly synthesized DNA. B) Nucleosomes are disassembled and reassembled on the two daughter strands, along with new histones. The Polycomb repressive complex 2 (PRC2) copies H3K27me3 from old nucleosomes (grey balls) to new nucleosomes (white balls). PRC2 consists of EED (Embryonic ectoderm development protein), which reads H3K27me3, EZH2 (Enhancer of zeste 2), which methylates H3K27, and SUZ12. H3K9me3 is read and copied by SUV39H1/2 (suppressor of variegation 3-9 homolog 1/2; not shown). Straight arrow, direction of movement of replication fork.

Figure 2: Classification of epimutations. Epimutations are primary or secondary epimutations. Primary epimutations are one of several causes of imprinting disorders. Other causes are a mutation in a cis-regulatory element or a trans-acting factor (see text). Fragile X syndrome and facioscapulohumeral muscular dystrophy are caused by the expansion or contraction of a tandem repeat sequence; above or below a certain number of repeats, the DNA is methylated or demethylated (→ secondary epimutation in cis). Weaver syndrome is an example of diseases involving secondary epimutations caused by a mutation in a trans-acting factor.

Table 1: Secondary epimutations caused by mutations affecting chromatin modifiers.
[13] that many histone modifying enzymes also modify non-histone proteins. Table reproduced and modified from [13] under the terms of the Creative Commons Attribution License (© Fu, Merrill, Gibson, Turvey and Kobor). mC, 5-methylcytosine